Dataset statistics
| Number of variables | 23 |
|---|---|
| Number of observations | 1481915 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.3 GiB |
| Average record size in memory | 933.0 B |
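The two memory figures above can be cross-checked against the observation count. The sketch below assumes the report uses binary units (1 GiB = 2^30 bytes); the variable names are illustrative.

```python
# Figures copied from the "Dataset statistics" table above.
n_observations = 1_481_915
avg_record_bytes = 933.0  # reported average record size in bytes

GIB = 2 ** 30  # assumption: the report uses binary gibibytes

# Total memory is the per-record average times the number of records.
total_bytes = n_observations * avg_record_bytes
total_gib = total_bytes / GIB

print(f"{total_gib:.1f} GiB")
```

Under that assumption the product rounds to the reported total size.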
Variable types
| Numeric | 10 |
|---|---|
| DateTime | 2 |
| Text | 8 |
| Categorical | 3 |
Alerts
| Alert | Type |
|---|---|
| zip is highly overall correlated with long and 1 other field | High correlation |
| lat is highly overall correlated with merch_lat | High correlation |
| long is highly overall correlated with zip and 1 other field | High correlation |
| merch_lat is highly overall correlated with lat | High correlation |
| merch_long is highly overall correlated with zip and 1 other field | High correlation |
| is_fraud is highly imbalanced (95.3%) | Imbalance |
| amt is highly skewed (γ1 = 43.56282153) | Skewed |
| trans_num has unique values | Unique |
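The report does not state how the imbalance percentage for is_fraud is computed. One common definition, taken here as an assumption, is one minus the Shannon entropy of the class distribution, normalized by the logarithm of the number of classes. The sketch below implements that definition; the class counts are hypothetical and are not taken from the dataset.

```python
import math

def imbalance_score(counts):
    """Return 1 - H(p) / log(k) for class counts; 0 is balanced, 1 is a single class."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is reported as 0
    # Shannon entropy of the empirical class distribution (natural log).
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(n_classes)

# Hypothetical counts, for illustration only.
balanced = imbalance_score([500, 500])
skewed = imbalance_score([995, 5])
print(f"balanced: {balanced:.3f}, skewed: {skewed:.3f}")
```

With this definition, a binary column in which the rare class makes up well under 1% of rows scores above 0.9, which is the regime the alert describes.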
Reproduction
| Analysis started | 2023-10-13 04:34:13.187562 |
|---|---|
| Analysis finished | 2023-10-13 04:37:23.877652 |
| Duration | 3 minutes and 10.69 seconds |
| Software version | ydata-profiling v4.6.0 |
| Download configuration | config.json |
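The reported duration follows directly from the two timestamps. A minimal check using only the standard library, with the timestamps copied from the table above:

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the "Reproduction" table.
started = datetime.strptime("2023-10-13 04:34:13.187562", FORMAT)
finished = datetime.strptime("2023-10-13 04:37:23.877652", FORMAT)

# Elapsed wall-clock time in seconds, split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```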
Unnamed: 0
Real number (ℝ)
| Distinct | 1126161 |
|---|---|
| Distinct (%) | 76.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 537337.46 |
| Minimum | 0 |
|---|---|
| Maximum | 1296674 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 11.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 46297 |
| Q1 | 231654.5 |
| median | 463235 |
| Q3 | 833691.5 |
| 95-th percentile | 1204169.6 |
| Maximum | 1296674 |
| Range | 1296674 |
| Interquartile range (IQR) | 602037 |
Descriptive statistics
| Standard deviation | 366977.01 |
|---|---|
| Coefficient of variation (CV) | 0.68295444 |
| Kurtosis | -0.96196499 |
| Mean | 537337.46 |
| Median Absolute Deviation (MAD) | 277853 |
| Skewness | 0.45382814 |
| Sum | 7.9628845 × 10¹¹ |
| Variance | 1.3467212 × 10¹¹ |
| Monotonicity | Not monotonic |
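Several rows in the quantile and descriptive tables are derived from other rows. The sketch below recomputes them from the reported values for Unnamed: 0, using the standard definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared). Small differences from the report are expected because the inputs are already rounded.

```python
# Values copied from the "Unnamed: 0" tables above.
minimum, maximum = 0, 1296674
q1, q3 = 231654.5, 833691.5
mean, std = 537337.46, 366977.01

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(value_range, iqr, round(cv, 8), f"{variance:.7e}")
```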
| Value | Count | Frequency (%) |
| 461738 | 2 | < 0.1% |
| 161354 | 2 | < 0.1% |
| 357478 | 2 | < 0.1% |
| 117678 | 2 | < 0.1% |
| 78809 | 2 | < 0.1% |
| 221230 | 2 | < 0.1% |
| 43433 | 2 | < 0.1% |
| 262895 | 2 | < 0.1% |
| 322867 | 2 | < 0.1% |
| 213961 | 2 | < 0.1% |
| Other values (1126151) | 1481895 |
| Value | Count | Frequency (%) |
| 0 | 2 | |
| 1 | 2 | |
| 2 | 1 | |
| 3 | 2 | |
| 4 | 2 | |
| 5 | 2 | |
| 6 | 2 | |
| 7 | 2 | |
| 8 | 1 | |
| 9 | 2 |
| Value | Count | Frequency (%) |
| 1296674 | 1 | |
| 1296673 | 1 | |
| 1296672 | 1 | |
| 1296671 | 1 | |
| 1296669 | 1 | |
| 1296668 | 1 | |
| 1296667 | 1 | |
| 1296666 | 1 | |
| 1296665 | 1 | |
| 1296663 | 1 |
| Distinct | 1460892 |
|---|---|
| Distinct (%) | 98.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 11.3 MiB |
| Minimum | 2019-01-01 00:00:18 |
|---|---|
| Maximum | 2020-12-31 23:59:34 |
cc_num
Real number (ℝ)
| Distinct | 999 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.1749841 × 10¹⁷ |
| Minimum | 6.0416207 × 10¹⁰ |
|---|---|
| Maximum | 4.9923464 × 10¹⁸ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 11.3 MiB |
Quantile statistics
| Minimum | 6.0416207 × 10¹⁰ |
|---|---|
| 5-th percentile | 6.3048488 × 10¹¹ |
| Q1 | 1.8004295 × 10¹⁴ |
| median | 3.5214173 × 10¹⁵ |
| Q3 | 4.6422555 × 10¹⁵ |
| 95-th percentile | 4.497914 × 10¹⁸ |
| Maximum | 4.9923464 × 10¹⁸ |
| Range | 4.9923463 × 10¹⁸ |
| Interquartile range (IQR) | 4.4622125 × 10¹⁵ |
Descriptive statistics
| Standard deviation | 1.309315 × 10¹⁸ |
|---|---|
| Coefficient of variation (CV) | 3.1360958 |
| Kurtosis | 6.1729868 |
| Mean | 4.1749841 × 10¹⁷ |
| Median Absolute Deviation (MAD) | 3.0764709 × 10¹⁵ |
| Skewness | 2.8506603 |
| Sum | -6.633861 × 10¹⁸ |
| Variance | 1.7143058 × 10³⁶ |
| Monotonicity | Not monotonic |
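The Sum row for cc_num is negative even though the column contains no negative values. The report does not explain this. One plausible cause, stated here as an assumption, is that the column is stored as signed 64-bit integers and the running sum wrapped around. The sketch below shows that the true sum (roughly mean × count) is far outside the int64 range, and how wrap-around behaves; `wrap_int64` is an illustrative helper, not part of any library.

```python
INT64_MIN, INT64_MAX = -2 ** 63, 2 ** 63 - 1

def wrap_int64(value):
    """Reduce an arbitrary Python int to the value a signed 64-bit integer would hold."""
    return (value - INT64_MIN) % 2 ** 64 + INT64_MIN

# Figures copied from the report.
n_observations = 1_481_915
mean_cc_num = 4.1749841e17

# The true sum is approximately mean * count, about 6.19e23.
approx_true_sum = n_observations * mean_cc_num
overflows = approx_true_sum > INT64_MAX

# Wrap-around turns a value just past the maximum into a large negative one.
wrapped = wrap_int64(INT64_MAX + 1)

print(f"approximate true sum: {approx_true_sum:.3e}, exceeds int64: {overflows}")
print(f"INT64_MAX + 1 wraps to {wrapped}")
```

If this is the cause, the Sum row should be disregarded. Treating cc_num as an identifier (text or categorical) rather than a numeric column would avoid the problem.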
| Value | Count | Frequency (%) |
| 3.672269902 × 10¹³ | 3559 | 0.2% |
| 6.304249875 × 10¹¹ | 3547 | 0.2% |
| 3.459339645 × 10¹⁴ | 3545 | 0.2% |
| 4.642255475 × 10¹⁵ | 3540 | 0.2% |
| 4.364010865 × 10¹⁵ | 3535 | 0.2% |
| 2.712209726 × 10¹⁵ | 3530 | 0.2% |
| 2.131124026 × 10¹⁴ | 3530 | 0.2% |
| 4.716561797 × 10¹⁵ | 3527 | 0.2% |
| 3.575789282 × 10¹⁵ | 3526 | 0.2% |
| 6.011438889 × 10¹⁵ | 3526 | 0.2% |
| Other values (989) | 1446550 |
| Value | Count | Frequency (%) |
| 6.041620718 × 10¹⁰ | 1754 | |
| 6.042292873 × 10¹⁰ | 1769 | |
| 6.042309813 × 10¹⁰ | 583 | < 0.1% |
| 6.042785159 × 10¹⁰ | 608 | < 0.1% |
| 6.048700208 × 10¹⁰ | 603 | < 0.1% |
| 6.04905963 × 10¹⁰ | 1156 | |
| 6.049559311 × 10¹⁰ | 599 | < 0.1% |
| 5.018029536 × 10¹¹ | 1762 | |
| 5.018181333 × 10¹¹ | 7 | < 0.1% |
| 5.018282048 × 10¹¹ | 590 | < 0.1% |
| Value | Count | Frequency (%) |
| 4.992346398 × 10¹⁸ | 2330 | |
| 4.989847571 × 10¹⁸ | 1180 | 0.1% |
| 4.980323468 × 10¹⁸ | 605 | < 0.1% |
| 4.973530368 × 10¹⁸ | 1150 | 0.1% |
| 4.958589672 × 10¹⁸ | 1757 | |
| 4.95682899 × 10¹⁸ | 2984 | |
| 4.911818931 × 10¹⁸ | 7 | < 0.1% |
| 4.906628656 × 10¹⁸ | 2922 | |
| 4.897067971 × 10¹⁸ | 1183 | 0.1% |
| 4.890424427 × 10¹⁸ | 1747 |
merchant
Text
| Distinct | 693 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 113.2 MiB |
Length
| Max length | 43 |
|---|---|
| Median length | 36 |
| Mean length | 23.128412 |
| Min length | 13 |
Characters and Unicode
| Total characters | 34274340 |
|---|---|
| Distinct characters | 55 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | fraud_Haag-Blanda |
|---|---|
| 2nd row | fraud_Pouros-Conroy |
| 3rd row | fraud_Wiza LLC |
| 4th row | fraud_Jast Ltd |
| 5th row | fraud_Pouros-Conroy |
| Value | Count | Frequency (%) |
| and | 541748 | 15.7% |
| llc | 111493 | 3.2% |
| inc | 105105 | 3.0% |
| sons | 83735 | 2.4% |
| ltd | 80863 | 2.3% |
| plc | 75707 | 2.2% |
| group | 57806 | 1.7% |
| fraud_kutch | 11958 | 0.3% |
| fraud_schaefer | 10693 | 0.3% |
| fraud_streich | 10597 | 0.3% |
| Other values (804) | 2364693 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 3326512 | 9.7% |
| r | 3080922 | 9.0% |
| d | 2444700 | 7.1% |
| e | 2131357 | 6.2% |
| u | 2123007 | 6.2% |
| n | 2021193 | 5.9% |
| (space) | 1972483 | 5.8% |
| f | 1596911 | 4.7% |
| _ | 1481915 | 4.3% |
| o | 1291119 | 3.8% |
| Other values (45) | 12804221 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 25936890 | |
| Uppercase Letter | 3882622 | 11.3% |
| Space Separator | 1972483 | 5.8% |
| Connector Punctuation | 1481915 | 4.3% |
| Dash Punctuation | 509193 | 1.5% |
| Other Punctuation | 491237 | 1.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 3326512 | |
| r | 3080922 | |
| d | 2444700 | |
| e | 2131357 | 8.2% |
| u | 2123007 | 8.2% |
| n | 2021193 | 7.8% |
| f | 1596911 | 6.2% |
| o | 1291119 | 5.0% |
| i | 1234035 | 4.8% |
| t | 997994 | 3.8% |
| Other values (15) | 5689140 |
Uppercase Letter
| Value | Count | Frequency (%) |
| L | 544396 | |
| C | 356370 | 9.2% |
| S | 344804 | 8.9% |
| B | 318777 | 8.2% |
| H | 298336 | 7.7% |
| K | 247906 | 6.4% |
| G | 219718 | 5.7% |
| R | 207248 | 5.3% |
| M | 204609 | 5.3% |
| P | 182171 | 4.7% |
| Other values (15) | 958287 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 458013 | |
| ' | 33224 | 6.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1972483 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 1481915 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 509193 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 29819512 | |
| Common | 4454828 | 13.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 3326512 | 11.2% |
| r | 3080922 | 10.3% |
| d | 2444700 | 8.2% |
| e | 2131357 | 7.1% |
| u | 2123007 | 7.1% |
| n | 2021193 | 6.8% |
| f | 1596911 | 5.4% |
| o | 1291119 | 4.3% |
| i | 1234035 | 4.1% |
| t | 997994 | 3.3% |
| Other values (40) | 9571762 |
Common
| Value | Count | Frequency (%) |
| (space) | 1972483 | |
| _ | 1481915 | |
| - | 509193 | 11.4% |
| , | 458013 | 10.3% |
| ' | 33224 | 0.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 34274340 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 3326512 | 9.7% |
| r | 3080922 | 9.0% |
| d | 2444700 | 7.1% |
| e | 2131357 | 6.2% |
| u | 2123007 | 6.2% |
| n | 2021193 | 5.9% |
| (space) | 1972483 | 5.8% |
| f | 1596911 | 4.7% |
| _ | 1481915 | 4.3% |
| o | 1291119 | 3.8% |
| Other values (45) | 12804221 |
category
Categorical
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 95.4 MiB |
| gas_transport | |
|---|---|
| grocery_pos | |
| home | |
| shopping_pos | |
| kids_pets | |
| Other values (9) |
Length
| Max length | 14 |
|---|---|
| Median length | 12 |
| Mean length | 10.526853 |
| Min length | 4 |
Characters and Unicode
| Total characters | 15599902 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | food_dining |
|---|---|
| 2nd row | shopping_pos |
| 3rd row | misc_pos |
| 4th row | shopping_net |
| 5th row | shopping_pos |
Common Values
| Value | Count | Frequency (%) |
| gas_transport | 150320 | |
| grocery_pos | 140903 | |
| home | 140289 | |
| shopping_pos | 133196 | |
| kids_pets | 129324 | |
| shopping_net | 111281 | |
| entertainment | 107386 | |
| food_dining | 104776 | 7.1% |
| personal_care | 104118 | 7.0% |
| health_fitness | 98276 | 6.6% |
| Other values (4) | 262046 |
Length
| Value | Count | Frequency (%) |
| gas_transport | 150320 | |
| grocery_pos | 140903 | |
| home | 140289 | |
| shopping_pos | 133196 | |
| kids_pets | 129324 | |
| shopping_net | 111281 | |
| entertainment | 107386 | |
| food_dining | 104776 | 7.1% |
| personal_care | 104118 | 7.0% |
| health_fitness | 98276 | 6.6% |
| Other values (4) | 262046 |
Most occurring characters
| Value | Count | Frequency (%) |
| s | 1633613 | |
| e | 1471362 | |
| o | 1406839 | |
| n | 1364580 | |
| p | 1238069 | 7.9% |
| t | 1230747 | 7.9% |
| _ | 1187846 | 7.6% |
| r | 1048116 | 6.7% |
| i | 952840 | 6.1% |
| a | 760932 | 4.9% |
| Other values (10) | 3304958 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 14412056 | |
| Connector Punctuation | 1187846 | 7.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| s | 1633613 | |
| e | 1471362 | |
| o | 1406839 | |
| n | 1364580 | |
| p | 1238069 | |
| t | 1230747 | |
| r | 1048116 | |
| i | 952840 | 6.6% |
| a | 760932 | 5.3% |
| g | 692303 | 4.8% |
| Other values (9) | 2612655 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 1187846 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 14412056 | |
| Common | 1187846 | 7.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| s | 1633613 | |
| e | 1471362 | |
| o | 1406839 | |
| n | 1364580 | |
| p | 1238069 | |
| t | 1230747 | |
| r | 1048116 | |
| i | 952840 | 6.6% |
| a | 760932 | 5.3% |
| g | 692303 | 4.8% |
| Other values (9) | 2612655 |
Common
| Value | Count | Frequency (%) |
| _ | 1187846 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 15599902 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| s | 1633613 | |
| e | 1471362 | |
| o | 1406839 | |
| n | 1364580 | |
| p | 1238069 | 7.9% |
| t | 1230747 | 7.9% |
| _ | 1187846 | 7.6% |
| r | 1048116 | 6.7% |
| i | 952840 | 6.1% |
| a | 760932 | 4.9% |
| Other values (10) | 3304958 |
amt
Real number (ℝ)
SKEWED 
| Distinct | 55373 |
|---|---|
| Distinct (%) | 3.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 70.01506 |
| Minimum | 1 |
|---|---|
| Maximum | 28948.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 11.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2.44 |
| Q1 | 9.64 |
| median | 47.44 |
| Q3 | 83.09 |
| 95-th percentile | 195.34 |
| Maximum | 28948.9 |
| Range | 28947.9 |
| Interquartile range (IQR) | 73.45 |
Descriptive statistics
| Standard deviation | 160.63 |
|---|---|
| Coefficient of variation (CV) | 2.2942206 |
| Kurtosis | 4687.9012 |
| Mean | 70.01506 |
| Median Absolute Deviation (MAD) | 37.45 |
| Skewness | 43.562822 |
| Sum | 1.0375637 × 10⁸ |
| Variance | 25801.996 |
| Monotonicity | Not monotonic |
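The skewness above is large and positive, meaning the distribution of amt has a long right tail: most values are small (the median is 47.44) while a few reach tens of thousands. As a minimal illustration, the sketch below computes the moment-based skewness, the third central moment divided by the second central moment raised to the power 3/2. The report's exact estimator is not stated and may apply a bias correction. The sample values are hypothetical.

```python
def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical samples, for illustration only.
symmetric = skewness([1.0, 2.0, 3.0, 4.0, 5.0])
right_tailed = skewness([1.0, 1.0, 2.0, 2.0, 100.0])

print(f"symmetric: {symmetric:.3f}, right-tailed: {right_tailed:.3f}")
```

A symmetric sample gives zero, while a single large value pulls the statistic strongly positive, which is the pattern the maximum 10 values of amt suggest.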
| Value | Count | Frequency (%) |
| 1.14 | 642 | < 0.1% |
| 1.01 | 610 | < 0.1% |
| 1.08 | 610 | < 0.1% |
| 1.2 | 600 | < 0.1% |
| 1.1 | 599 | < 0.1% |
| 1.25 | 592 | < 0.1% |
| 1.02 | 590 | < 0.1% |
| 1.03 | 589 | < 0.1% |
| 1.04 | 588 | < 0.1% |
| 1.16 | 586 | < 0.1% |
| Other values (55363) | 1475909 |
| Value | Count | Frequency (%) |
| 1 | 267 | |
| 1.01 | 610 | |
| 1.02 | 590 | |
| 1.03 | 589 | |
| 1.04 | 588 | |
| 1.05 | 572 | |
| 1.06 | 527 | |
| 1.07 | 578 | |
| 1.08 | 610 | |
| 1.09 | 579 |
| Value | Count | Frequency (%) |
| 28948.9 | 1 | |
| 27390.12 | 1 | |
| 27119.77 | 1 | |
| 26544.12 | 1 | |
| 25086.94 | 1 | |
| 22768.11 | 1 | |
| 21437.71 | 1 | |
| 19364.91 | 1 | |
| 17897.24 | 1 | |
| 16837.08 | 1 |
first
Text
| Distinct | 355 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 89.1 MiB |
Length
| Max length | 11 |
|---|---|
| Median length | 9 |
| Mean length | 6.0800188 |
| Min length | 3 |
Characters and Unicode
| Total characters | 9010071 |
|---|---|
| Distinct characters | 49 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Kristen |
|---|---|
| 2nd row | Mary |
| 3rd row | Rebecca |
| 4th row | Adam |
| 5th row | Dawn |
| Value | Count | Frequency (%) |
| christopher | 30432 | 2.1% |
| robert | 24579 | 1.7% |
| jessica | 23499 | 1.6% |
| david | 22943 | 1.5% |
| james | 22928 | 1.5% |
| michael | 22865 | 1.5% |
| jennifer | 19415 | 1.3% |
| william | 18749 | 1.3% |
| john | 18674 | 1.3% |
| mary | 18650 | 1.3% |
| Other values (345) | 1259181 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 1150721 | 12.8% |
| e | 984435 | 10.9% |
| i | 707020 | 7.8% |
| n | 702293 | 7.8% |
| r | 693648 | 7.7% |
| l | 443775 | 4.9% |
| h | 394646 | 4.4% |
| s | 371004 | 4.1% |
| t | 356040 | 4.0% |
| o | 307422 | 3.4% |
| Other values (39) | 2899067 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 7528156 | |
| Uppercase Letter | 1481915 | 16.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 1150721 | |
| e | 984435 | |
| i | 707020 | |
| n | 702293 | |
| r | 693648 | |
| l | 443775 | 5.9% |
| h | 394646 | 5.2% |
| s | 371004 | 4.9% |
| t | 356040 | 4.7% |
| o | 307422 | 4.1% |
| Other values (16) | 1417152 |
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 250298 | |
| M | 165626 | |
| S | 130907 | |
| A | 128963 | |
| C | 121223 | |
| D | 98504 | 6.6% |
| K | 97781 | 6.6% |
| R | 80237 | 5.4% |
| T | 76123 | 5.1% |
| L | 72046 | 4.9% |
| Other values (13) | 260207 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 9010071 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 1150721 | 12.8% |
| e | 984435 | 10.9% |
| i | 707020 | 7.8% |
| n | 702293 | 7.8% |
| r | 693648 | 7.7% |
| l | 443775 | 4.9% |
| h | 394646 | 4.4% |
| s | 371004 | 4.1% |
| t | 356040 | 4.0% |
| o | 307422 | 3.4% |
| Other values (39) | 2899067 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9010071 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 1150721 | 12.8% |
| e | 984435 | 10.9% |
| i | 707020 | 7.8% |
| n | 702293 | 7.8% |
| r | 693648 | 7.7% |
| l | 443775 | 4.9% |
| h | 394646 | 4.4% |
| s | 371004 | 4.1% |
| t | 356040 | 4.0% |
| o | 307422 | 3.4% |
| Other values (39) | 2899067 |
last
Text
| Distinct | 486 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 89.2 MiB |
Length
| Max length | 11 |
|---|---|
| Median length | 10 |
| Mean length | 6.1118411 |
| Min length | 2 |
Characters and Unicode
| Total characters | 9057229 |
|---|---|
| Distinct characters | 48 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Allen |
|---|---|
| 2nd row | Wall |
| 3rd row | Obrien |
| 4th row | Stark |
| 5th row | Gray |
| Value | Count | Frequency (%) |
| smith | 32816 | 2.2% |
| williams | 26944 | 1.8% |
| davis | 25196 | 1.7% |
| johnson | 22848 | 1.5% |
| rodriguez | 19926 | 1.3% |
| martinez | 16989 | 1.1% |
| jones | 15911 | 1.1% |
| lewis | 14581 | 1.0% |
| gonzalez | 13464 | 0.9% |
| miller | 13444 | 0.9% |
| Other values (476) | 1279796 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 898167 | 9.9% |
| r | 753288 | 8.3% |
| a | 740913 | 8.2% |
| n | 695784 | 7.7% |
| o | 666291 | 7.4% |
| l | 559085 | 6.2% |
| s | 557628 | 6.2% |
| i | 498090 | 5.5% |
| t | 330090 | 3.6% |
| h | 261905 | 2.9% |
| Other values (38) | 3095988 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 7575314 | |
| Uppercase Letter | 1481915 | 16.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 898167 | |
| r | 753288 | |
| a | 740913 | |
| n | 695784 | |
| o | 666291 | |
| l | 559085 | 7.4% |
| s | 557628 | 7.4% |
| i | 498090 | 6.6% |
| t | 330090 | 4.4% |
| h | 261905 | 3.5% |
| Other values (15) | 1614073 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 181519 | |
| W | 121838 | 8.2% |
| S | 120053 | 8.1% |
| C | 106662 | 7.2% |
| B | 96273 | 6.5% |
| R | 94844 | 6.4% |
| H | 93004 | 6.3% |
| G | 86463 | 5.8% |
| J | 82121 | 5.5% |
| P | 75564 | 5.1% |
| Other values (13) | 423574 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 9057229 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 898167 | 9.9% |
| r | 753288 | 8.3% |
| a | 740913 | 8.2% |
| n | 695784 | 7.7% |
| o | 666291 | 7.4% |
| l | 559085 | 6.2% |
| s | 557628 | 6.2% |
| i | 498090 | 5.5% |
| t | 330090 | 3.6% |
| h | 261905 | 2.9% |
| Other values (38) | 3095988 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9057229 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 898167 | 9.9% |
| r | 753288 | 8.3% |
| a | 740913 | 8.2% |
| n | 695784 | 7.7% |
| o | 666291 | 7.4% |
| l | 559085 | 6.2% |
| s | 557628 | 6.2% |
| i | 498090 | 5.5% |
| t | 330090 | 3.6% |
| h | 261905 | 2.9% |
| Other values (38) | 3095988 |
gender
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 82.0 MiB |
| F | |
|---|---|
| M |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1481915 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | F |
|---|---|
| 2nd row | F |
| 3rd row | F |
| 4th row | M |
| 5th row | F |
Common Values
| Value | Count | Frequency (%) |
| F | 811696 | |
| M | 670219 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| f | 811696 | |
| m | 670219 |
Most occurring characters
| Value | Count | Frequency (%) |
| F | 811696 | |
| M | 670219 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 1481915 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 811696 | |
| M | 670219 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1481915 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| F | 811696 | |
| M | 670219 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1481915 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| F | 811696 | |
| M | 670219 |
street
Text
| Distinct | 999 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 112.0 MiB |
Length
| Max length | 35 |
|---|---|
| Median length | 29 |
| Mean length | 22.231728 |
| Min length | 12 |
Characters and Unicode
| Total characters | 32945531 |
|---|---|
| Distinct characters | 62 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 8619 Lisa Manors Apt. 871 |
|---|---|
| 2nd row | 2481 Mills Lock |
| 3rd row | 5619 Mendoza Inlet |
| 4th row | 0912 Mark Fields Apt. 080 |
| 5th row | 9486 Joel Common Suite 554 |
| Value | Count | Frequency (%) |
| apt | 375057 | 6.4% |
| suite | 349293 | 5.9% |
| island | 26310 | 0.4% |
| michael | 21607 | 0.4% |
| islands | 20496 | 0.3% |
| common | 20482 | 0.3% |
| station | 20455 | 0.3% |
| david | 19914 | 0.3% |
| brooks | 19301 | 0.3% |
| fields | 18702 | 0.3% |
| Other values (1959) | 5002828 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 4412530 | 13.4% |
| e | 2048222 | 6.2% |
| a | 1662290 | 5.0% |
| i | 1480942 | 4.5% |
| t | 1425355 | 4.3% |
| r | 1261206 | 3.8% |
| n | 1219452 | 3.7% |
| s | 1181368 | 3.6% |
| l | 1016532 | 3.1% |
| o | 1000689 | 3.0% |
| Other values (52) | 16236945 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 16471821 | |
| Decimal Number | 7997943 | |
| Space Separator | 4412530 | 13.4% |
| Uppercase Letter | 3688180 | 11.2% |
| Other Punctuation | 375057 | 1.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 2048222 | |
| a | 1662290 | |
| i | 1480942 | |
| t | 1425355 | |
| r | 1261206 | 7.7% |
| n | 1219452 | 7.4% |
| s | 1181368 | 7.2% |
| l | 1016532 | 6.2% |
| o | 1000689 | 6.1% |
| u | 701368 | 4.3% |
| Other values (16) | 3474397 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 642045 | |
| A | 482819 | |
| M | 294697 | 8.0% |
| C | 255165 | 6.9% |
| P | 223938 | 6.1% |
| R | 213248 | 5.8% |
| B | 169409 | 4.6% |
| F | 163860 | 4.4% |
| L | 150561 | 4.1% |
| J | 138640 | 3.8% |
| Other values (14) | 953798 |
Decimal Number
| Value | Count | Frequency (%) |
| 5 | 855536 | |
| 3 | 846197 | |
| 2 | 840260 | |
| 7 | 803451 | |
| 1 | 792693 | |
| 8 | 790925 | |
| 0 | 774792 | |
| 6 | 774382 | |
| 4 | 766890 | |
| 9 | 752817 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 4412530 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 375057 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 20160001 | |
| Common | 12785530 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 2048222 | 10.2% |
| a | 1662290 | 8.2% |
| i | 1480942 | 7.3% |
| t | 1425355 | 7.1% |
| r | 1261206 | 6.3% |
| n | 1219452 | 6.0% |
| s | 1181368 | 5.9% |
| l | 1016532 | 5.0% |
| o | 1000689 | 5.0% |
| u | 701368 | 3.5% |
| Other values (40) | 7162577 |
Common
| Value | Count | Frequency (%) |
| (space) | 4412530 | |
| 5 | 855536 | 6.7% |
| 3 | 846197 | 6.6% |
| 2 | 840260 | 6.6% |
| 7 | 803451 | 6.3% |
| 1 | 792693 | 6.2% |
| 8 | 790925 | 6.2% |
| 0 | 774792 | 6.1% |
| 6 | 774382 | 6.1% |
| 4 | 766890 | 6.0% |
| Other values (2) | 1127874 | 8.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 32945531 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 4412530 | 13.4% |
| e | 2048222 | 6.2% |
| a | 1662290 | 5.0% |
| i | 1480942 | 4.5% |
| t | 1425355 | 4.3% |
| r | 1261206 | 3.8% |
| n | 1219452 | 3.7% |
| s | 1181368 | 3.6% |
| l | 1016532 | 3.1% |
| o | 1000689 | 3.0% |
| Other values (52) | 16236945 |
city
Text
| Distinct | 906 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 92.8 MiB |
Length
| Max length | 25 |
|---|---|
| Median length | 21 |
| Mean length | 8.652312 |
| Min length | 3 |
Characters and Unicode
| Total characters | 12821991 |
|---|---|
| Distinct characters | 52 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Lagrange |
|---|---|
| 2nd row | Plainfield |
| 3rd row | Juliette |
| 4th row | Mc Veytown |
| 5th row | Topeka |
| Value | Count | Frequency (%) |
| city | 24620 | 1.3% |
| west | 22253 | 1.2% |
| north | 16480 | 0.9% |
| saint | 16380 | 0.9% |
| falls | 14687 | 0.8% |
| new | 13480 | 0.7% |
| lake | 12928 | 0.7% |
| mount | 12889 | 0.7% |
| san | 11756 | 0.6% |
| springs | 9947 | 0.5% |
| Other values (929) | 1694492 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 1245001 | 9.7% |
| a | 1068358 | 8.3% |
| n | 939180 | 7.3% |
| o | 934885 | 7.3% |
| l | 891781 | 7.0% |
| r | 856686 | 6.7% |
| i | 804866 | 6.3% |
| t | 684441 | 5.3% |
| s | 510055 | 4.0% |
| (space) | 367997 | 2.9% |
| Other values (42) | 4518741 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10601722 | |
| Uppercase Letter | 1851092 | 14.4% |
| Space Separator | 367997 | 2.9% |
| Dash Punctuation | 1180 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1245001 | |
| a | 1068358 | |
| n | 939180 | |
| o | 934885 | |
| l | 891781 | 8.4% |
| r | 856686 | 8.1% |
| i | 804866 | 7.6% |
| t | 684441 | 6.5% |
| s | 510055 | 4.8% |
| d | 353626 | 3.3% |
| Other values (15) | 2312843 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 179294 | 9.7% |
| M | 169130 | 9.1% |
| S | 155035 | 8.4% |
| B | 152323 | 8.2% |
| H | 132196 | 7.1% |
| W | 108953 | 5.9% |
| P | 105407 | 5.7% |
| L | 99000 | 5.3% |
| R | 90720 | 4.9% |
| A | 85414 | 4.6% |
| Other values (15) | 573620 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 367997 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1180 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 12452814 | |
| Common | 369177 | 2.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1245001 | 10.0% |
| a | 1068358 | 8.6% |
| n | 939180 | 7.5% |
| o | 934885 | 7.5% |
| l | 891781 | 7.2% |
| r | 856686 | 6.9% |
| i | 804866 | 6.5% |
| t | 684441 | 5.5% |
| s | 510055 | 4.1% |
| d | 353626 | 2.8% |
| Other values (40) | 4163935 |
Common
| Value | Count | Frequency (%) |
| (space) | 367997 | |
| - | 1180 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 12821991 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 1245001 | 9.7% |
| a | 1068358 | 8.3% |
| n | 939180 | 7.3% |
| o | 934885 | 7.3% |
| l | 891781 | 7.0% |
| r | 856686 | 6.7% |
| i | 804866 | 6.3% |
| t | 684441 | 5.3% |
| s | 510055 | 4.0% |
| (space) | 367997 | 2.9% |
| Other values (42) | 4518741 |
state
Text
| Distinct | 51 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 83.4 MiB |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 2963830 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | WY |
|---|---|
| 2nd row | NJ |
| 3rd row | GA |
| 4th row | PA |
| 5th row | KS |
| Value | Count | Frequency (%) |
| tx | 108194 | 7.3% |
| ny | 95669 | 6.5% |
| pa | 91463 | 6.2% |
| ca | 64402 | 4.3% |
| oh | 53400 | 3.6% |
| mi | 52566 | 3.5% |
| il | 49669 | 3.4% |
| fl | 48582 | 3.3% |
| al | 46799 | 3.2% |
| mo | 43861 | 3.0% |
| Other values (41) | 827310 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 406997 | |
| N | 325088 | 11.0% |
| M | 251295 | 8.5% |
| I | 208184 | 7.0% |
| T | 176141 | 5.9% |
| L | 169150 | 5.7% |
| O | 164723 | 5.6% |
| C | 161112 | 5.4% |
| Y | 150642 | 5.1% |
| X | 108194 | 3.7% |
| Other values (14) | 842304 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 2963830 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 406997 | |
| N | 325088 | 11.0% |
| M | 251295 | 8.5% |
| I | 208184 | 7.0% |
| T | 176141 | 5.9% |
| L | 169150 | 5.7% |
| O | 164723 | 5.6% |
| C | 161112 | 5.4% |
| Y | 150642 | 5.1% |
| X | 108194 | 3.7% |
| Other values (14) | 842304 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2963830 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 406997 | |
| N | 325088 | 11.0% |
| M | 251295 | 8.5% |
| I | 208184 | 7.0% |
| T | 176141 | 5.9% |
| L | 169150 | 5.7% |
| O | 164723 | 5.6% |
| C | 161112 | 5.4% |
| Y | 150642 | 5.1% |
| X | 108194 | 3.7% |
| Other values (14) | 842304 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2963830 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 406997 | |
| N | 325088 | 11.0% |
| M | 251295 | 8.5% |
| I | 208184 | 7.0% |
| T | 176141 | 5.9% |
| L | 169150 | 5.7% |
| O | 164723 | 5.6% |
| C | 161112 | 5.4% |
| Y | 150642 | 5.1% |
| X | 108194 | 3.7% |
| Other values (14) | 842304 |
zip
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 985 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 48814.501 |
| Minimum | 1257 |
|---|---|
| Maximum | 99921 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 11.3 MiB |
Quantile statistics
| Minimum | 1257 |
|---|---|
| 5-th percentile | 7208 |
| Q1 | 26237 |
| median | 48174 |
| Q3 | 72011 |
| 95-th percentile | 94569 |
| Maximum | 99921 |
| Range | 98664 |
| Interquartile range (IQR) | 45774 |
Descriptive statistics
| Standard deviation | 26879.732 |
|---|---|
| Coefficient of variation (CV) | 0.55065054 |
| Kurtosis | -1.0964545 |
| Mean | 48814.501 |
| Median Absolute Deviation (MAD) | 23068 |
| Skewness | 0.079134573 |
| Sum | 7.2338942 × 10¹⁰ |
| Variance | 7.2251997 × 10⁸ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 73754 | 4090 | 0.3% |
| 34112 | 4069 | 0.3% |
| 82514 | 4062 | 0.3% |
| 48088 | 4055 | 0.3% |
| 85173 | 3559 | 0.2% |
| 26292 | 3547 | 0.2% |
| 21872 | 3545 | 0.2% |
| 84540 | 3540 | 0.2% |
| 89512 | 3535 | 0.2% |
| 29819 | 3530 | 0.2% |
| Other values (975) | 1444383 |
| Value | Count | Frequency (%) |
| 1257 | 2313 | |
| 1330 | 1194 | |
| 1535 | 592 | < 0.1% |
| 1545 | 1144 | 0.1% |
| 1612 | 608 | < 0.1% |
| 1843 | 2885 | |
| 1844 | 2296 | |
| 2180 | 601 | < 0.1% |
| 2630 | 2364 | |
| 2908 | 594 | < 0.1% |
| Value | Count | Frequency (%) |
| 99921 | 12 | < 0.1% |
| 99783 | 1753 | |
| 99747 | 11 | < 0.1% |
| 99746 | 572 | < 0.1% |
| 99323 | 2940 | |
| 99160 | 3499 | |
| 99116 | 12 | < 0.1% |
| 99113 | 1185 | 0.1% |
| 99033 | 2882 | |
| 98836 | 595 | < 0.1% |
lat
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 983 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38.537659 |
| Minimum | 20.0271 |
|---|---|
| Maximum | 66.6933 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 11.3 MiB |
Quantile statistics
| Minimum | 20.0271 |
|---|---|
| 5-th percentile | 29.8826 |
| Q1 | 34.6689 |
| median | 39.3543 |
| Q3 | 41.8948 |
| 95-th percentile | 45.8433 |
| Maximum | 66.6933 |
| Range | 46.6662 |
| Interquartile range (IQR) | 7.2259 |
Descriptive statistics
| Standard deviation | 5.0704933 |
|---|---|
| Coefficient of variation (CV) | 0.13157243 |
| Kurtosis | 0.78272213 |
| Mean | 38.537659 |
| Median Absolute Deviation (MAD) | 3.3597 |
| Skewness | -0.19402462 |
| Sum | 57109535 |
| Variance | 25.709902 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 36.385 | 4090 | 0.3% |
| 26.1184 | 4069 | 0.3% |
| 43.0048 | 4062 | 0.3% |
| 42.5164 | 4055 | 0.3% |
| 33.2887 | 3559 | 0.2% |
| 39.1505 | 3547 | 0.2% |
| 38.4121 | 3545 | 0.2% |
| 38.9999 | 3540 | 0.2% |
| 39.5483 | 3535 | 0.2% |
| 34.0326 | 3530 | 0.2% |
| Other values (973) | 1444383 |
| Value | Count | Frequency (%) |
| 20.0271 | 1737 | |
| 20.0827 | 1187 | 0.1% |
| 24.6557 | 2922 | |
| 26.1184 | 4069 | |
| 26.3304 | 599 | < 0.1% |
| 26.3771 | 574 | < 0.1% |
| 26.4215 | 3498 | |
| 26.4722 | 2914 | |
| 26.529 | 1782 | |
| 26.6939 | 1170 | 0.1% |
| Value | Count | Frequency (%) |
| 66.6933 | 11 | < 0.1% |
| 65.6899 | 572 | < 0.1% |
| 64.7556 | 1753 | |
| 55.4732 | 12 | < 0.1% |
| 48.8878 | 3499 | |
| 48.8856 | 2338 | |
| 48.8328 | 1765 | |
| 48.6669 | 1180 | 0.1% |
| 48.6031 | 3505 | |
| 48.4786 | 2334 |
long
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 983 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -90.228007 |
| Minimum | -165.6723 |
|---|---|
| Maximum | -67.9503 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 1481915 |
| Negative (%) | 100.0% |
| Memory size | 11.3 MiB |
Quantile statistics
| Minimum | -165.6723 |
|---|---|
| 5-th percentile | -119.0825 |
| Q1 | -96.798 |
| median | -87.4769 |
| Q3 | -80.158 |
| 95-th percentile | -73.5365 |
| Maximum | -67.9503 |
| Range | 97.722 |
| Interquartile range (IQR) | 16.64 |
Descriptive statistics
| Standard deviation | 13.745414 |
|---|---|
| Coefficient of variation (CV) | -0.15234088 |
| Kurtosis | 1.8315168 |
| Mean | -90.228007 |
| Median Absolute Deviation (MAD) | 8.1527 |
| Skewness | -1.145943 |
| Sum | -1.3371024 × 10⁸ |
| Variance | 188.93641 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -98.0727 | 4090 | 0.3% |
| -81.7361 | 4069 | 0.3% |
| -108.8964 | 4062 | 0.3% |
| -82.9832 | 4055 | 0.3% |
| -111.0985 | 3559 | 0.2% |
| -79.503 | 3547 | 0.2% |
| -75.2811 | 3545 | 0.2% |
| -82.7243 | 3540 | 0.2% |
| -109.615 | 3540 | 0.2% |
| -119.7957 | 3535 | 0.2% |
| Other values (973) | 1444373 |
| Value | Count | Frequency (%) |
| -165.6723 | 1753 | |
| -156.292 | 572 | < 0.1% |
| -155.488 | 1187 | |
| -155.3697 | 1737 | |
| -153.994 | 11 | < 0.1% |
| -133.1171 | 12 | < 0.1% |
| -124.4409 | 1167 | |
| -124.2174 | 1740 | |
| -124.1587 | 1168 | |
| -124.1437 | 1764 |
| Value | Count | Frequency (%) |
| -67.9503 | 2303 | |
| -68.5565 | 1188 | 0.1% |
| -69.2675 | 592 | < 0.1% |
| -69.4828 | 2360 | |
| -69.9576 | 596 | < 0.1% |
| -69.9656 | 3482 | |
| -70.1031 | 6 | < 0.1% |
| -70.239 | 1137 | 0.1% |
| -70.3001 | 2364 | |
| -70.3457 | 1741 |
city_pop
Real number (ℝ)
| Distinct | 891 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 88571.477 |
| Minimum | 23 |
|---|---|
| Maximum | 2906700 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 11.3 MiB |
Quantile statistics
| Minimum | 23 |
|---|---|
| 5-th percentile | 139 |
| Q1 | 741 |
| median | 2443 |
| Q3 | 20328 |
| 95-th percentile | 525713 |
| Maximum | 2906700 |
| Range | 2906677 |
| Interquartile range (IQR) | 19587 |
Descriptive statistics
| Standard deviation | 301272.82 |
|---|---|
| Coefficient of variation (CV) | 3.4014654 |
| Kurtosis | 37.564955 |
| Mean | 88571.477 |
| Median Absolute Deviation (MAD) | 2188 |
| Skewness | 5.5900914 |
| Sum | 1.312554 × 10¹¹ |
| Variance | 9.076531 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 606 | 6352 | 0.4% |
| 1312922 | 5882 | 0.4% |
| 1595797 | 5881 | 0.4% |
| 241 | 5335 | 0.4% |
| 1766 | 5239 | 0.4% |
| 302 | 4712 | 0.3% |
| 198 | 4674 | 0.3% |
| 2135 | 4673 | 0.3% |
| 1126 | 4671 | 0.3% |
| 276002 | 4668 | 0.3% |
| Other values (881) | 1429828 |
| Value | Count | Frequency (%) |
| 23 | 2318 | |
| 37 | 1169 | 0.1% |
| 43 | 2363 | |
| 46 | 3540 | |
| 47 | 595 | < 0.1% |
| 49 | 1176 | 0.1% |
| 51 | 1149 | 0.1% |
| 52 | 593 | < 0.1% |
| 53 | 2920 | |
| 60 | 1184 | 0.1% |
| Value | Count | Frequency (%) |
| 2906700 | 4667 | |
| 2504700 | 2343 | 0.2% |
| 2383912 | 586 | < 0.1% |
| 1595797 | 5881 | |
| 1577385 | 2928 | |
| 1526206 | 4071 | |
| 1417793 | 8 | < 0.1% |
| 1382480 | 2346 | 0.2% |
| 1312922 | 5882 | |
| 1263321 | 4088 |
job
Text
| Distinct | 497 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 109.2 MiB |
Length
| Max length | 59 |
|---|---|
| Median length | 38 |
| Mean length | 20.23293 |
| Min length | 3 |
Characters and Unicode
| Total characters | 29983483 |
|---|---|
| Distinct characters | 53 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Product/process development scientist |
|---|---|
| 2nd row | Leisure centre manager |
| 3rd row | Theatre manager |
| 4th row | Nutritional therapist |
| 5th row | Secondary school teacher |
| Value | Count | Frequency (%) |
| engineer | 150561 | 4.6% |
| officer | 126893 | 3.9% |
| manager | 70214 | 2.1% |
| scientist | 63722 | 1.9% |
| designer | 59784 | 1.8% |
| surveyor | 56116 | 1.7% |
| teacher | 43778 | 1.3% |
| psychologist | 37628 | 1.1% |
| research | 33860 | 1.0% |
| editor | 32822 | 1.0% |
| Other values (457) | 2616002 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 3203585 | 10.7% |
| i | 2726816 | 9.1% |
| r | 2512106 | 8.4% |
| a | 2074000 | 6.9% |
| t | 2038492 | 6.8% |
| n | 2017188 | 6.7% |
| (space) | 1809465 | 6.0% |
| o | 1706564 | 5.7% |
| s | 1651280 | 5.5% |
| c | 1512921 | 5.0% |
| Other values (43) | 8731066 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 26046581 | |
| Space Separator | 1809465 | 6.0% |
| Uppercase Letter | 1565403 | 5.2% |
| Other Punctuation | 506968 | 1.7% |
| Close Punctuation | 27533 | 0.1% |
| Open Punctuation | 27533 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 3203585 | |
| i | 2726816 | |
| r | 2512106 | |
| a | 2074000 | 8.0% |
| t | 2038492 | 7.8% |
| n | 2017188 | 7.7% |
| o | 1706564 | 6.6% |
| s | 1651280 | 6.3% |
| c | 1512921 | 5.8% |
| l | 1142898 | 4.4% |
| Other values (16) | 5460731 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 179610 | |
| E | 166236 | |
| P | 164114 | |
| S | 156833 | |
| T | 129604 | 8.3% |
| M | 101668 | 6.5% |
| A | 100772 | 6.4% |
| F | 78594 | 5.0% |
| D | 66218 | 4.2% |
| R | 63853 | 4.1% |
| Other values (11) | 357901 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 357196 | |
| / | 140938 | 27.8% |
| ' | 8834 | 1.7% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1809465 | |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 27533 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 27533 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 27611984 | |
| Common | 2371499 | 7.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 3203585 | |
| i | 2726816 | 9.9% |
| r | 2512106 | 9.1% |
| a | 2074000 | 7.5% |
| t | 2038492 | 7.4% |
| n | 2017188 | 7.3% |
| o | 1706564 | 6.2% |
| s | 1651280 | 6.0% |
| c | 1512921 | 5.5% |
| l | 1142898 | 4.1% |
| Other values (37) | 7026134 |
Common
| Value | Count | Frequency (%) |
| (space) | 1809465 | |
| , | 357196 | 15.1% |
| / | 140938 | 5.9% |
| ) | 27533 | 1.2% |
| ( | 27533 | 1.2% |
| ' | 8834 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 29983483 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 3203585 | 10.7% |
| i | 2726816 | 9.1% |
| r | 2512106 | 8.4% |
| a | 2074000 | 6.9% |
| t | 2038492 | 6.8% |
| n | 2017188 | 6.7% |
| (space) | 1809465 | 6.0% |
| o | 1706564 | 5.7% |
| s | 1651280 | 5.5% |
| c | 1512921 | 5.0% |
| Other values (43) | 8731066 |
dob
Date
| Distinct | 984 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 11.3 MiB |
| Minimum | 1924-10-30 00:00:00 |
|---|---|
| Maximum | 2005-01-29 00:00:00 |
trans_num
Text
UNIQUE 
| Distinct | 1481915 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 125.8 MiB |
Length
| Max length | 32 |
|---|---|
| Median length | 32 |
| Mean length | 32 |
| Min length | 32 |
Characters and Unicode
| Total characters | 47421280 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1481915 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | fd19c51e0b694609f42034aa3bf1830a |
|---|---|
| 2nd row | 133cc647eb444e9344a5c15f6419ce23 |
| 3rd row | dcf293f8901483f86f9627dbd603e0cf |
| 4th row | bf7c695b05563c0259f175783b11811c |
| 5th row | 195277f89533d9a16ff4f3581b629c59 |
| Value | Count | Frequency (%) |
| fd19c51e0b694609f42034aa3bf1830a | 1 | < 0.1% |
| 195277f89533d9a16ff4f3581b629c59 | 1 | < 0.1% |
| 8ae9ac14b13f6f0f1ab262b01dbf007c | 1 | < 0.1% |
| 288d1ec2511320a8468523201f07ca48 | 1 | < 0.1% |
| 342c50d9c172c7693b729f995be2bd99 | 1 | < 0.1% |
| 4a9b5b066a9c3d0826875fe1381fdf73 | 1 | < 0.1% |
| c60bc926c338fd86b23171d39a6fe0d2 | 1 | < 0.1% |
| ed8bca1e004eb58b830d22e5f4ab7d08 | 1 | < 0.1% |
| e3a32d1c7052529ef76c4d46f40b053d | 1 | < 0.1% |
| f29d2a7bbf5f57c19118ae671054549d | 1 | < 0.1% |
| Other values (1481905) | 1481905 |
Most occurring characters
| Value | Count | Frequency (%) |
| 4 | 2967212 | 6.3% |
| 9 | 2966257 | 6.3% |
| 7 | 2965584 | 6.3% |
| 1 | 2965402 | 6.3% |
| 3 | 2965160 | 6.3% |
| 2 | 2964826 | 6.3% |
| d | 2964458 | 6.3% |
| a | 2963816 | 6.2% |
| f | 2963653 | 6.2% |
| 5 | 2963292 | 6.2% |
| Other values (6) | 17771620 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 29643962 | |
| Lowercase Letter | 17777318 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 2967212 | |
| 9 | 2966257 | |
| 7 | 2965584 | |
| 1 | 2965402 | |
| 3 | 2965160 | |
| 2 | 2964826 | |
| 5 | 2963292 | |
| 0 | 2962230 | |
| 8 | 2962041 | |
| 6 | 2961958 |
Lowercase Letter
| Value | Count | Frequency (%) |
| d | 2964458 | |
| a | 2963816 | |
| f | 2963653 | |
| c | 2963209 | |
| b | 2961300 | |
| e | 2960882 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 29643962 | |
| Latin | 17777318 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 4 | 2967212 | |
| 9 | 2966257 | |
| 7 | 2965584 | |
| 1 | 2965402 | |
| 3 | 2965160 | |
| 2 | 2964826 | |
| 5 | 2963292 | |
| 0 | 2962230 | |
| 8 | 2962041 | |
| 6 | 2961958 |
Latin
| Value | Count | Frequency (%) |
| d | 2964458 | |
| a | 2963816 | |
| f | 2963653 | |
| c | 2963209 | |
| b | 2961300 | |
| e | 2960882 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 47421280 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 4 | 2967212 | 6.3% |
| 9 | 2966257 | 6.3% |
| 7 | 2965584 | 6.3% |
| 1 | 2965402 | 6.3% |
| 3 | 2965160 | 6.3% |
| 2 | 2964826 | 6.3% |
| d | 2964458 | 6.3% |
| a | 2963816 | 6.2% |
| f | 2963653 | 6.2% |
| 5 | 2963292 | 6.2% |
| Other values (6) | 17771620 |
unix_time
Real number (ℝ)
| Distinct | 1460914 |
|---|---|
| Distinct (%) | 98.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.3586718 × 10⁹ |
| Minimum | 1.325376 × 10⁹ |
|---|---|
| Maximum | 1.3885344 × 10⁹ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 11.3 MiB |
Quantile statistics
| Minimum | 1.325376 × 10⁹ |
|---|---|
| 5-th percentile | 1.3301045 × 10⁹ |
| Q1 | 1.343012 × 10⁹ |
| median | 1.3570821 × 10⁹ |
| Q3 | 1.3745664 × 10⁹ |
| 95-th percentile | 1.3867823 × 10⁹ |
| Maximum | 1.3885344 × 10⁹ |
| Range | 63158356 |
| Interquartile range (IQR) | 31554376 |
Descriptive statistics
| Standard deviation | 18192852 |
|---|---|
| Coefficient of variation (CV) | 0.013390174 |
| Kurtosis | -1.1995227 |
| Mean | 1.3586718 × 10⁹ |
| Median Absolute Deviation (MAD) | 15787347 |
| Skewness | -0.019417778 |
| Sum | 2.0134361 × 10¹⁵ |
| Variance | 3.3097986 × 10¹⁴ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1370177227 | 4 | < 0.1% |
| 1387312599 | 4 | < 0.1% |
| 1355835657 | 3 | < 0.1% |
| 1357998610 | 3 | < 0.1% |
| 1386274418 | 3 | < 0.1% |
| 1354985811 | 3 | < 0.1% |
| 1387062311 | 3 | < 0.1% |
| 1351896039 | 3 | < 0.1% |
| 1354458672 | 3 | < 0.1% |
| 1379203657 | 3 | < 0.1% |
| Other values (1460904) | 1481883 |
| Value | Count | Frequency (%) |
| 1325376018 | 1 | |
| 1325376044 | 1 | |
| 1325376051 | 1 | |
| 1325376076 | 1 | |
| 1325376186 | 1 | |
| 1325376248 | 1 | |
| 1325376282 | 1 | |
| 1325376308 | 1 | |
| 1325376361 | 1 | |
| 1325376383 | 1 |
| Value | Count | Frequency (%) |
| 1388534374 | 1 | |
| 1388534364 | 1 | |
| 1388534355 | 1 | |
| 1388534349 | 1 | |
| 1388534347 | 1 | |
| 1388534314 | 1 | |
| 1388534284 | 1 | |
| 1388534270 | 1 | |
| 1388534238 | 1 | |
| 1388534217 | 1 |
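The extreme values above can be read as calendar dates by treating `unix_time` as seconds since the Unix epoch. A minimal sketch, assuming the values are epoch seconds in UTC (the report does not state the unit or time zone, and the helper name `epoch_to_utc` is hypothetical):

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds: int) -> datetime:
    """Interpret an integer as seconds since 1970-01-01 00:00:00 UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Smallest and largest values from the extreme-value tables above.
first = epoch_to_utc(1325376018)
last = epoch_to_utc(1388534374)

print(first.isoformat())
print(last.isoformat())
```

Under that assumption the column spans 2012-01-01 to 2013-12-31. The first sample row's `unix_time` of 1342981627 converts to 2012-07-22 18:27:07, while its `trans_date_trans_time` reads 2019-07-22 18:27:07. That is an offset of seven years, so it is worth checking whether the two columns agree before using them together.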
merch_lat
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 1418540 |
|---|---|
| Distinct (%) | 95.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38.537131 |
| Minimum | 19.027422 |
|---|---|
| Maximum | 67.510267 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 11.3 MiB |
Quantile statistics
| Minimum | 19.027422 |
|---|---|
| 5-th percentile | 29.751155 |
| Q1 | 34.741212 |
| median | 39.369006 |
| Q3 | 41.953798 |
| 95-th percentile | 46.004211 |
| Maximum | 67.510267 |
| Range | 48.482845 |
| Interquartile range (IQR) | 7.2125855 |
Descriptive statistics
| Standard deviation | 5.104556 |
|---|---|
| Coefficient of variation (CV) | 0.13245812 |
| Kurtosis | 0.76465667 |
| Mean | 38.537131 |
| Median Absolute Deviation (MAD) | 3.387247 |
| Skewness | -0.1903319 |
| Sum | 57108753 |
| Variance | 26.056492 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 41.204635 | 4 | < 0.1% |
| 39.818513 | 4 | < 0.1% |
| 43.682006 | 4 | < 0.1% |
| 36.545289 | 4 | < 0.1% |
| 39.516582 | 4 | < 0.1% |
| 39.734227 | 4 | < 0.1% |
| 33.995152 | 4 | < 0.1% |
| 38.47822 | 4 | < 0.1% |
| 41.51721 | 4 | < 0.1% |
| 37.810182 | 4 | < 0.1% |
| Other values (1418530) | 1481875 |
| Value | Count | Frequency (%) |
| 19.027422 | 1 | |
| 19.027785 | 1 | |
| 19.027804 | 1 | |
| 19.027849 | 1 | |
| 19.029798 | 1 | |
| 19.031242 | 1 | |
| 19.032277 | 1 | |
| 19.032689 | 1 | |
| 19.033288 | 1 | |
| 19.034282 | 1 |
| Value | Count | Frequency (%) |
| 67.510267 | 1 | |
| 67.441518 | 1 | |
| 67.397018 | 1 | |
| 67.188111 | 1 | |
| 67.064277 | 1 | |
| 66.835174 | 1 | |
| 66.682905 | 1 | |
| 66.679297 | 1 | |
| 66.67355 | 1 | |
| 66.67154 | 1 |
merch_long
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 1454618 |
|---|---|
| Distinct (%) | 98.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -90.228151 |
| Minimum | -166.67157 |
|---|---|
| Maximum | -66.950902 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 1481915 |
| Negative (%) | 100.0% |
| Memory size | 11.3 MiB |
Quantile statistics
| Minimum | -166.67157 |
|---|---|
| 5-th percentile | -119.30615 |
| Q1 | -96.903202 |
| median | -87.43976 |
| Q3 | -80.245857 |
| 95-th percentile | -73.369183 |
| Maximum | -66.950902 |
| Range | 99.720673 |
| Interquartile range (IQR) | 16.657345 |
Descriptive statistics
| Standard deviation | 13.756976 |
|---|---|
| Coefficient of variation (CV) | -0.15246877 |
| Kurtosis | 1.8252507 |
| Mean | -90.228151 |
| Median Absolute Deviation (MAD) | 8.223868 |
| Skewness | -1.1429122 |
| Sum | -1.3371045 × 10⁸ |
| Variance | 189.25438 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -73.900295 | 4 | < 0.1% |
| -96.511763 | 4 | < 0.1% |
| -74.618269 | 4 | < 0.1% |
| -92.521318 | 4 | < 0.1% |
| -80.893888 | 4 | < 0.1% |
| -81.888319 | 3 | < 0.1% |
| -80.285495 | 3 | < 0.1% |
| -88.524861 | 3 | < 0.1% |
| -85.971056 | 3 | < 0.1% |
| -81.746112 | 3 | < 0.1% |
| Other values (1454608) | 1481880 |
| Value | Count | Frequency (%) |
| -166.671575 | 1 | |
| -166.671242 | 1 | |
| -166.670006 | 1 | |
| -166.66991 | 1 | |
| -166.669638 | 1 | |
| -166.666179 | 1 | |
| -166.664828 | 1 | |
| -166.661968 | 1 | |
| -166.658797 | 1 | |
| -166.657834 | 1 |
| Value | Count | Frequency (%) |
| -66.950902 | 1 | |
| -66.952026 | 1 | |
| -66.952352 | 1 | |
| -66.955602 | 1 | |
| -66.95654 | 1 | |
| -66.957364 | 1 | |
| -66.958659 | 1 | |
| -66.958751 | 1 | |
| -66.959178 | 1 | |
| -66.959498 | 1 |
is_fraud
Categorical
IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 82.0 MiB |
| 0 | 1474147 |
|---|---|
| 1 | 7768 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1481915 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1474147 | |
| 1 | 7768 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1474147 | |
| 1 | 7768 | 0.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1481915 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1474147 | |
| 1 | 7768 | 0.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1481915 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1474147 | |
| 1 | 7768 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1481915 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1474147 | |
| 1 | 7768 | 0.5% |
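The class shares and the 95.3% imbalance figure in the report's alerts can be reproduced from the two counts above. A minimal sketch, assuming the imbalance score is one minus the Shannon entropy of the class distribution divided by its maximum possible value (this formula reproduces the reported figure, but it is an assumption here and is not stated in the report):

```python
import math

# Class counts taken from the is_fraud tables above.
counts = {"0": 1474147, "1": 7768}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

# Shannon entropy in bits; its maximum for k classes is log2(k).
entropy = -sum(p * math.log2(p) for p in shares.values())
imbalance = 1 - entropy / math.log2(len(counts))

print(f"fraud share: {shares['1']:.3%}")
print(f"imbalance score: {imbalance:.1%}")
```

A score near 1 means almost all rows fall into one class. Here roughly one transaction in 190 is labelled as fraud.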
| | Unnamed: 0 | cc_num | amt | zip | lat | long | city_pop | unix_time | merch_lat | merch_long | category | gender | is_fraud |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Unnamed: 0 | 1.000 | 0.002 | 0.000 | 0.000 | -0.000 | -0.001 | -0.001 | 0.172 | -0.000 | -0.001 | 0.001 | 0.000 | 0.015 |
| cc_num | 0.002 | 1.000 | -0.001 | 0.014 | -0.003 | -0.014 | 0.049 | 0.001 | -0.004 | -0.014 | 0.009 | 0.052 | 0.002 |
| amt | 0.000 | -0.001 | 1.000 | 0.002 | 0.013 | -0.001 | -0.024 | -0.001 | 0.013 | -0.001 | 0.019 | 0.000 | 0.000 |
| zip | 0.000 | 0.014 | 0.002 | 1.000 | -0.162 | -0.959 | -0.040 | 0.001 | -0.162 | -0.957 | 0.011 | 0.116 | 0.005 |
| lat | -0.000 | -0.003 | 0.013 | -0.162 | 1.000 | 0.106 | -0.264 | 0.001 | 0.991 | 0.104 | 0.010 | 0.101 | 0.040 |
| long | -0.001 | -0.014 | -0.001 | -0.959 | 0.106 | 1.000 | 0.087 | -0.001 | 0.105 | 0.998 | 0.009 | 0.091 | 0.040 |
| city_pop | -0.001 | 0.049 | -0.024 | -0.040 | -0.264 | 0.087 | 1.000 | -0.003 | -0.263 | 0.086 | 0.014 | 0.089 | 0.002 |
| unix_time | 0.172 | 0.001 | -0.001 | 0.001 | 0.001 | -0.001 | -0.003 | 1.000 | 0.001 | -0.001 | 0.001 | 0.000 | 0.022 |
| merch_lat | -0.000 | -0.004 | 0.013 | -0.162 | 0.991 | 0.105 | -0.263 | 0.001 | 1.000 | 0.104 | 0.011 | 0.102 | 0.040 |
| merch_long | -0.001 | -0.014 | -0.001 | -0.957 | 0.104 | 0.998 | 0.086 | -0.001 | 0.104 | 1.000 | 0.009 | 0.082 | 0.040 |
| category | 0.001 | 0.009 | 0.019 | 0.011 | 0.010 | 0.009 | 0.014 | 0.001 | 0.011 | 0.009 | 1.000 | 0.054 | 0.067 |
| gender | 0.000 | 0.052 | 0.000 | 0.116 | 0.101 | 0.091 | 0.089 | 0.000 | 0.102 | 0.082 | 0.054 | 1.000 | 0.006 |
| is_fraud | 0.015 | 0.002 | 0.000 | 0.005 | 0.040 | 0.040 | 0.002 | 0.022 | 0.040 | 0.040 | 0.067 | 0.006 | 1.000 |
| | Unnamed: 0 | trans_date_trans_time | cc_num | merchant | category | amt | first | last | gender | street | city | state | zip | lat | long | city_pop | job | dob | trans_num | unix_time | merch_lat | merch_long | is_fraud |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 461738 | 2019-07-22 18:27:07 | 6011381817520024 | fraud_Haag-Blanda | food_dining | 21.98 | Kristen | Allen | F | 8619 Lisa Manors Apt. 871 | Lagrange | WY | 82221 | 41.6423 | -104.1974 | 635 | Product/process development scientist | 1973-07-13 | fd19c51e0b694609f42034aa3bf1830a | 1342981627 | 41.554200 | -103.287380 | 0 |
| 1 | 1226808 | 2020-05-26 23:20:45 | 180048185037117 | fraud_Pouros-Conroy | shopping_pos | 2.71 | Mary | Wall | F | 2481 Mills Lock | Plainfield | NJ | 7060 | 40.6152 | -74.4150 | 71485 | Leisure centre manager | 1974-07-19 | 133cc647eb444e9344a5c15f6419ce23 | 1369610445 | 41.082326 | -73.784634 | 0 |
| 2 | 410879 | 2019-07-05 11:18:22 | 342351256941125 | fraud_Wiza LLC | misc_pos | 3.34 | Rebecca | Obrien | F | 5619 Mendoza Inlet | Juliette | GA | 31046 | 33.1194 | -83.8235 | 3343 | Theatre manager | 1990-06-08 | dcf293f8901483f86f9627dbd603e0cf | 1341487102 | 32.903079 | -83.781837 | 0 |
| 3 | 481047 | 2019-07-29 15:53:35 | 6011366578560244 | fraud_Jast Ltd | shopping_net | 9.12 | Adam | Stark | M | 0912 Mark Fields Apt. 080 | Mc Veytown | PA | 17051 | 40.5046 | -77.7186 | 4653 | Nutritional therapist | 1997-07-01 | bf7c695b05563c0259f175783b11811c | 1343577215 | 41.125233 | -78.267574 | 0 |
| 4 | 468410 | 2019-07-25 18:11:18 | 3576021480694169 | fraud_Pouros-Conroy | shopping_pos | 4.25 | Dawn | Gray | F | 9486 Joel Common Suite 554 | Topeka | KS | 66618 | 39.1329 | -95.7023 | 163415 | Secondary school teacher | 2004-12-30 | 195277f89533d9a16ff4f3581b629c59 | 1343239878 | 39.968570 | -95.706471 | 0 |
| 5 | 373542 | 2019-06-22 03:05:25 | 38947654498698 | fraud_Spinka Inc | grocery_net | 46.52 | Lori | Rodriguez | F | 12087 Michael Light | Creola | OH | 45622 | 39.3543 | -82.5030 | 321 | Copywriter, advertising | 1979-06-24 | df9eaa22784e41ae68f74c250b9b2fd2 | 1340334325 | 38.616542 | -81.620499 | 0 |
| 6 | 267059 | 2019-05-12 16:09:16 | 377993105397617 | fraud_Gulgowski LLC | home | 12.17 | Nathan | Martinez | M | 586 Thomas Cliffs | Oconto Falls | WI | 54154 | 44.8755 | -88.1555 | 5548 | Mining engineer | 1975-09-11 | 8ae9ac14b13f6f0f1ab262b01dbf007c | 1336838956 | 45.419141 | -87.647265 | 0 |
| 7 | 1045298 | 2020-03-09 15:40:18 | 3596217206093829 | fraud_Terry Ltd | home | 27.05 | Sara | Ramirez | F | 23843 Scott Island | Birmingham | IA | 52535 | 40.8626 | -91.9534 | 888 | Camera operator | 1988-03-25 | 288d1ec2511320a8468523201f07ca48 | 1362843618 | 40.346154 | -91.933066 | 0 |
| 8 | 23805 | 2019-01-14 21:19:48 | 4512828414983801773 | fraud_Bahringer Group | health_fitness | 7.19 | Monica | Cohen | F | 864 Reynolds Plains | Uledi | PA | 15484 | 39.8936 | -79.7856 | 328 | Tree surgeon | 1983-07-25 | 342c50d9c172c7693b729f995be2bd99 | 1326575988 | 40.744057 | -80.696765 | 0 |
| 9 | 618442 | 2019-09-20 09:01:36 | 36153880429415 | fraud_Flatley Group | misc_pos | 10.91 | Erik | Stevens | M | 84033 Pitts Overpass | Lakeland | FL | 33809 | 28.1762 | -81.9591 | 237282 | Plant breeder/geneticist | 1949-10-13 | 4a9b5b066a9c3d0826875fe1381fdf73 | 1348131696 | 28.956241 | -81.077362 | 0 |
| | Unnamed: 0 | trans_date_trans_time | cc_num | merchant | category | amt | first | last | gender | street | city | state | zip | lat | long | city_pop | job | dob | trans_num | unix_time | merch_lat | merch_long | is_fraud |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1481905 | 1279423 | 2020-06-15 05:12:39 | 4800395067176717 | fraud_Lockman Ltd | grocery_pos | 143.57 | Daniel | Owens | M | 88794 Mandy Lodge Apt. 874 | Howells | NE | 68641 | 41.6964 | -96.9858 | 1063 | Research scientist (maths) | 1928-04-02 | a454d53ab7be448248aa17ae3d99f05d | 1371273159 | 41.990784 | -97.431909 | 0 |
| 1481906 | 892937 | 2019-12-24 20:34:58 | 30234966027947 | fraud_Morissette PLC | shopping_pos | 5.99 | Matthew | Lambert | M | 7188 Melissa Crest Apt. 151 | New Holstein | WI | 53061 | 43.9446 | -88.0911 | 5196 | Child psychotherapist | 1978-01-22 | b579de399001f0170e700649046caedd | 1356381298 | 43.069894 | -88.627589 | 0 |
| 1481907 | 1122515 | 2020-04-12 17:59:36 | 503874407318 | fraud_Johns Inc | entertainment | 11.31 | Andrew | Mcgee | M | 4130 Tiffany Glen Apt. 562 | San Antonio | TX | 78248 | 29.5894 | -98.5201 | 1595797 | Exhibition designer | 1975-12-28 | 7ed7cf5c241b102d0f858ac141cdf8e4 | 1365789576 | 29.590449 | -98.897342 | 0 |
| 1481908 | 1187684 | 2020-05-11 09:36:12 | 4661996144291811856 | fraud_Kiehn-Emmerich | grocery_pos | 117.10 | Linda | Park | F | 24607 Charles Mountains | Fenelton | PA | 16034 | 40.8555 | -79.7372 | 2054 | Operations geologist | 1963-08-04 | 6587290429dfe378981726d52784118e | 1368264972 | 41.220313 | -80.544994 | 0 |
| 1481909 | 241556 | 2020-09-16 13:12:00 | 180018375329178 | fraud_Conroy Ltd | shopping_pos | 184.69 | Michelle | Woods | F | 952 Joseph Throughway | Munith | MI | 49259 | 42.3703 | -84.2485 | 2523 | Geophysicist/field seismologist | 1988-03-21 | 437d13b6e10cbf3a30888bcf706e0890 | 1379337120 | 42.805996 | -84.762264 | 0 |
| 1481910 | 1043289 | 2020-03-08 23:12:43 | 4306630852918 | fraud_Kozey-McDermott | travel | 7.16 | Maureen | Garza | F | 169 Edward Inlet | Saint Louis | MO | 63131 | 38.6171 | -90.4504 | 927396 | Occupational hygienist | 1960-03-12 | 5632230607c0f634db4625203f3d3d34 | 1362784363 | 38.109008 | -89.917944 | 0 |
| 1481911 | 412764 | 2019-07-06 03:59:03 | 4225990116481262579 | fraud_Hoppe, Harris and Bednar | entertainment | 4.16 | Brian | Simpson | M | 2711 Duran Pines | Honokaa | HI | 96727 | 20.0827 | -155.4880 | 4878 | Physiotherapist | 1966-12-03 | 111685f1f132266a0cecd2a64108fe89 | 1341547143 | 20.741936 | -155.393629 | 0 |
| 1481912 | 75245 | 2019-02-13 20:11:59 | 3521417320836166 | fraud_Turner, Ziemann and Lehner | food_dining | 84.69 | Angela | Hodges | F | 08236 Kim Hill | Indianapolis | IN | 46254 | 39.8490 | -86.2720 | 910148 | Firefighter | 1975-11-30 | 246db647966358be9baa402b13c984d7 | 1329163919 | 40.488363 | -86.228543 | 0 |
| 1481913 | 738664 | 2019-11-11 15:23:03 | 6011381817520024 | fraud_Graham and Sons | health_fitness | 28.85 | Kristen | Allen | F | 8619 Lisa Manors Apt. 871 | Lagrange | WY | 82221 | 41.6423 | -104.1974 | 635 | Product/process development scientist | 1973-07-13 | 0d589021b00957e7341efd7a78fc5e7b | 1352647383 | 41.394869 | -104.797962 | 0 |
| 1481914 | 526735 | 2020-12-25 18:51:59 | 6534628260579800 | fraud_Jacobi and Sons | shopping_pos | 2.71 | Christine | Harris | F | 29606 Martinez Views Suite 653 | Hinesburg | VT | 5461 | 44.3346 | -73.0980 | 4542 | Claims inspector/assessor | 1998-03-19 | 9eb4470ad9e46f8e5b649f10e462988a | 1387997519 | 43.978527 | -73.593372 | 0 |